[Security] Remove vulnerable ujson package#1757
Open
kukgini wants to merge 4 commits into
Open
Conversation
Signed-off-by: pstlouis <patrick.st-louis@opsecid.ca>
Signed-off-by: pstlouis <patrick.st-louis@opsecid.ca>
Signed-off-by: pstlouis <patrick.st-louis@opsecid.ca>
c018850 to
9362661
Compare
Author
aac207a to
57a3f25
Compare
57a3f25 to
733ec9d
Compare
Fixes layered on top of hyperledger-indy#1676's ujson removal. Dropping the dependency resolves the four ujson CVEs affecting the previously pinned 1.33 (CVE-2022-31116, CVE-2022-31117, CVE-2026-44660, CVE-2026-54911; CVE-2021-45958 cited in hyperledger-indy#1676 only affects ujson >= 1.34). - channel.py: the lint fix in hyperledger-indy#1676 replaced `type(msg) != tuple` with `not isinstance(type(msg), tuple)`, which is always True and wrapped already-tuple messages into nested tuples, breaking handler routing. Restore the original semantics in a lint-clean form (`type(msg) is not tuple`), preserving NamedTuple wrapping. - zstack serializeMsg: ujson emitted compact JSON; stdlib json's default separators add whitespace, growing every wire message and breaking the exact-size assertions in stp_zmq/test/test_zstack.py calibrated to MSG_LEN_LIMIT. Pass separators=(',', ':') to keep the wire format byte-compatible. (Applied to scripts/test_zmq's copy as well.) - recorder: SimpleZStackWithRecorder records raw wire frames; ujson silently encoded bytes as UTF-8 strings while stdlib json raises TypeError, so recording mode (STACK_COMPANION=1) crashed on the first message. Add a shared bytes-decoding default hook (plenum.common.util.json_default_bytes_to_str) and a bytes regression test. - Add regression tests pinning JsonSerializer's exact byte output (key ordering, compact separators, raw-UTF-8 non-ASCII, float repr, int-key coercion, top-level-bytes base64) since it feeds signing/hashing, plus characterisation of the nested-bytes TypeError. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Signed-off-by: kukgini <kukgini@gmail.com>
733ec9d to
ebdbb3f
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Removes the
ujsondependency and replaces it with the Python standard libraryjsonmodule across serialization, recorder, and zstack code paths.Per OSV / GitHub Advisory DB, the previously pinned
ujson==1.33is affected by four known vulnerabilities, all resolved by the removal:ujson.dump()on write failure (<= 5.12.0)ujson.dumps()(<= 5.12.1)(CVE-2021-45958, cited in #1676, does not apply to the pinned version: its vulnerable range is
>= 1.34, < 5.2.0. Removing the dependency moots it regardless.)This continues the stalled work from #1676 by @PatStLouis:
main(the previously requested rebase).channel.py).Ref: https://osv.dev/list?ecosystem=PyPI&q=ujson / https://security.snyk.io/package/pip/ujson
Changes
ujsonfromsetup.py, build scripts, and dev-setup package lists.common/serializers/json_serializer.py: remove thetry: import ujsonbranch; promote the existingOrderedJsonEncoder(stdlibjsonwithsort_keys=True,separators=(',', ':'),ensure_ascii=False) to be the sole encoder, preserving sorted-key / compact output.try: import ujson as json / except ImportError: import jsonfallbacks inrecorder.py,test_recorder.py,stp_zmq/zstack.py, andscripts/test_zmq/.../zstack.pywith a plainimport json.stp_zmq/zstack.py(and thescripts/test_zmqcopy):serializeMsgnow passesseparators=(',', ':'). ujson emitted compact JSON; stdlib's default separators insert whitespace, which grows every node-to-node wire message and breaks the exact-size assertions instp_zmq/test/test_zstack.py:242/284/329(calibrated toMSG_LEN_LIMITwith the 8-byte{'k':''}overhead of the compact form).plenum/recorder/recorder.py:add_to_storenow passesdefault=json_default_bytes_to_str(a shared helper added toplenum/common/util.py) sobytes/bytearrayare decoded to UTF-8 strings.SimpleZStackWithRecorderrecords raw wire frames — both the message and the ZMQ identity arrive as undecodedbytes— which ujson silently encoded as UTF-8 strings but stdlibjson.dumpsrejects withTypeError, so recording mode (STACK_COMPANION=1) crashed message servicing on the first frame. A bytes regression test is added (existing recorder tests only fedstr).Bug fix beyond #1676
In
plenum/common/channel.py, thelinting errorscommit of #1676 changed:The intent was to silence flake8 E721 (
do not compare types), which the originaltype(msg) != tupletriggers (E721 is not in this repo's.flake8ignore list). However the replacement is alwaysTrue:type(msg)is a class object, and a class object is never an instance oftuple. As a result, messages that were already tuples got wrapped again into a nested tuple — breaking handler routing in_process_sync— and it directly contradicts the adjacent comment explaining whyisinstancemust not be used here (it also matchesNamedTuple).This PR keeps the lint fix but does it correctly:
E721's own message recommends
is/is notfor exact type checks, so this form silences E721 and preserves the original semantics — it is not a revert of the linting fix. Verified with flake8 (--select=E721reports no violation) and againsttuple,NamedTuple,dict,str, andintinputs, where the result matches the original!=behavior in every case:type(msg) != tuple(pre-#1676)not isinstance(type(msg), tuple)(#1676)type(msg) is not tuple(this PR)Serialization compatibility
The previously pinned
ujson==1.33does not support thesort_keyskwarg — its C argument list is onlyobj, ensure_ascii, double_precision, encode_html_chars(verified against the 1.33 sdist from PyPI). The guard probe injson_serializer.py(uencode({...}, sort_keys=True)) therefore always raisedTypeErrorand fell through to the stdlibOrderedJsonEncoderfallback. In other words: the signing/hashing serializer was already running on stdlibjsonin every standard install, and its byte output is unchanged by this PR. The newcommon/test/test_json_serializer_ujson_compat.pypins that byte format going forward (key ordering, compact separators, raw-UTF-8 non-ASCII, float repr, int-key coercion, top-level-bytes base64, and the nested-bytesTypeErrorcharacterisation).The code paths where ujson was active are the ones without the probe (plain
try: import ujson as json): the recorder and zstack. The two behavioral differences there — compact vs whitespace-padded output, and silent bytes-to-string coercion — are exactly what theseparators=(',', ':')and recorderdefault=fixes above restore.Notes
pysha3remains out of scope (per discussion in [Security] remove ujson package (CVE-2022-31116, CVE-2022-31117, CVE-2021-45958) #1676).🤖 Generated with Claude Code